MF-Ontology: an Ontology for the Text Mining Domain
Authors
Abstract
Text mining (TM) has emerged as an established technique for knowledge acquisition from text. The TM process comprises several phases that prepare the text for mining, process the text, and analyze the results. Combining TM algorithms and techniques effectively and efficiently is a challenge, and most research focuses on developing new data structures, algorithms, and methods to that end. However, the TM process still lacks modeling support. The TM analyst faces many options when modeling a TM process; for instance, the analyst must choose the most effective solution for extracting the desired knowledge. This is a complex decision that involves choices in each phase of the TM process, where many algorithms and implementations are available for composition and several parameters must be tuned. The scenario tends to be chaotic, and each time a new modeling effort starts, the whole ad hoc process is repeated. A first step towards modeling support is to add semantics to the TM process and to record modeling results. Using ontologies to describe the TM domain can help structure the systematic composition of the algorithms and techniques of the text mining process. When the same structure is adopted, similar models can be identified and the reuse of TM software components (web services, local applications) is facilitated. In this paper we describe MF-Ontology, an ontology for modeling activity flow, tailored to the TM domain. MF-Ontology can be used to simplify the development of text-based knowledge discovery applications. It provides a reference model for the different phases of text mining tasks and for the methodologies and software available to solve a problem. Thus, MF-Ontology offers semantic support to the TM analyst in finding the most appropriate solution. We describe the design of MF-Ontology and analyze its different levels of abstraction for semantically representing the TM process. We also present an evaluation of MF-Ontology and show techniques for revising the ontology concepts based on interviews with specialists.
Similar resources
Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology
As the web grows, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Document clustering based on ontology and a fuzzy approach
Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering is one of the important techniques of text mining; it is the unsupervised classification of similar documents into different groups. The most important step...
Prioritize the ordering of URL queue in Focused crawler
The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, it is not a simple task to download domain-specific web pages. This unfocused approach often shows undesired results. Therefore, several new ideas have been proposed, among which a key technique is focused crawling, which is able to crawl particular topical...
Approach for managing ontology evolution by using Text Mining Techniques
The maintenance of the domain ontology or a knowledge model after the appearance of changes in the studied domain is an essential stage. Several studies provide methodologies for the maintenance of ontology but only some of them deal with ontologies that are created from texts. Text mining techniques provide good results when the processing of texts is done for the purpose of modeling or classi...